Automatic parallelization of fine-grained metafunctions on a chip multiprocessor

Similar resources

Enabling Parallelization via a Reconfigurable Chip Multiprocessor

While reconfigurable computing has traditionally involved attaching a reconfigurable fabric to a single processor core, the prospect of large-scale CMPs calls for a reevaluation of reconfigurable computing from the perspective of multicore architectures. We present ReMAPP, a reconfigurable architecture geared towards application acceleration and parallelization. In ReMAPP, parallel threads shar...

Fine-Grained Parallelization of a Vlasov-Poisson Application on GPU

Understanding turbulent transport in magnetised plasmas is a subject of major importance to optimise experiments in tokamak fusion reactors. Also, simulations of fusion plasma consume a great amount of CPU time on today’s supercomputers. The Vlasov equation provides a useful framework to model such plasma. In this paper, we focus on the parallelization of a 2D semi-Lagrangian Vlasov solver on G...

Case study of gate-level logic simulation on an extremely fine-grained chip multiprocessor

Explicit-multi-threading (XMT) is a parallel programming approach for exploiting on-chip parallelism. Its fine-grained single program multiple data (SPMD) programming model is suitable for many computing intensive applications. In this paper, we present a parallel gate level logic simulator implemented on an XMT platform and study its performance. Test results show potential for achieving more ...

Fine-grained parallelization of lattice QCD kernel routine on GPUs

Simulation time for the classical problem of Lattice Quantum Chromodynamics (Lattice QCD) is dominated by one kernel routine responsible for computing the actions of a Dirac operator. This paper describes an experience in parallelizing this kernel routine. We explore parallelization granularities for this kernel routine on Graphical Processing Units (GPUs). We show that fine-grained parallelism...

Automatic Compilation to a Coarse-grained Reconfigurable System-on-Chip

The rapid growth of device densities on silicon has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, one of the obstacles to the wider acceptance of this technology is its programmability. The application needs to be programmed in hardware description languages or an assembly equivalent, whereas most application programmers are used to the alg...

Journal

Journal title: ACM Transactions on Architecture and Code Optimization

Year: 2013

ISSN: 1544-3566,1544-3973

DOI: 10.1145/2541228.2541237